Run ProxySQL cluster inside single container + CI stability fixes #5528
renecannao merged 15 commits into v3.0-ci260322
Conversation
Instead of spawning 1+9 Docker containers for the ProxySQL cluster, run all instances as background processes inside the primary container. Each node gets its own port pair (admin=6032+i*10, mysql=6033+i*10) and data directory. This eliminates 9 Docker containers per group, avoiding the Docker networking/iptables contention that caused "no route to host" errors under parallel test execution.

Changes:
- start-proxysql-isolated.bash: starts cluster nodes as background processes inside the container, then initializes the cluster and installs the scheduler, all in one script
- check_all_nodes.bash: polls 127.0.0.1 on per-node ports instead of container hostnames
- env-isolated.bash: TAP_CLUSTER_NODES uses the proxysql:PORT format
- default/pre-proxysql.bash, legacy/pre-proxysql.bash: no longer call cluster_start.bash/cluster_init.bash (handled by start-proxysql)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
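The port scheme described above can be sketched as a small loop (the NUM_NODES count and data-directory layout here are illustrative, not copied from the script):

```shell
# Sketch of the per-node port scheme: admin=6032+i*10, mysql=6033+i*10,
# one data directory per node, so all nodes fit in one container.
NUM_NODES=3   # illustrative; the CI runs up to 9 nodes
for i in $(seq 1 "${NUM_NODES}"); do
  ADMIN_PORT=$((6032 + i * 10))
  MYSQL_PORT=$((6033 + i * 10))
  DATADIR="/var/lib/proxysql/node${i}"   # hypothetical path
  echo "proxy-node${i}: admin=${ADMIN_PORT} mysql=${MYSQL_PORT} dir=${DATADIR}"
done
```

Note that node1 starts at 6042/6043, leaving 6032/6033 free for the primary instance.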
fastcov and genhtml produce verbose output that clutters the group test logs. Now redirected to coverage-generation.log inside the coverage report directory. Status messages still go to stdout. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace exclude-based filtering (-e /usr/include, deps/) with include-based filtering (-i) for include/, lib/, src/, test/. This avoids capturing coverage for system headers, vendored dependencies, and other non-proxysql code. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
fastcov, lcov, and genhtml produce verbose output that clutters the multi-group summary. Now redirected to coverage-generation.log inside the combined coverage report directory. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Move 3 clickhouse TAP tests (clickhouse_php_conn-t, test_clickhouse_server-t, test_clickhouse_server_libmysql-t) from legacy-g1/g3 to legacy-clickhouse-g1. Remove infra-clickhouse23 from legacy/infras.lst. This eliminates the clickhouse container from all legacy-g1..g4 runs, reducing container count per group.
Note: Reviews paused. This branch is under active development; to avoid an influx of review comments on new commits, CodeRabbit has automatically paused this review.
📝 Walkthrough

Moves ProxySQL cluster startup into the container entrypoint, makes per-node admin ports dynamic (proxysql:PORT where PORT=6032+10*i), adds multi-node readiness checks, cluster init and scheduler injection, centralizes coverage-generation logging, adds a legacy-clickhouse group, and simplifies/neutralizes pre-proxysql hooks.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Entrypoint as Container Entrypoint
    participant Primary as Primary ProxySQL\nproxysql:6032
    participant Nodes as ProxySQL Nodes\nproxysql:6032+10*i
    participant Admin as Admin Client
    Entrypoint->>Primary: start primary (foreground, exec)
    Entrypoint->>Nodes: start NUM_NODES background proxy-node1..N (distinct dirs & ports)
    Entrypoint->>Primary: poll admin port 6032 until ready
    Entrypoint->>Nodes: poll each node admin port (6032 + 10*i) until ready
    Admin->>Primary: configure proxysql_servers (core nodes), cluster vars, load/save
    Admin->>Primary: add scheduler entries and install check script
    loop per node
        Admin->>Nodes: enable REST API, insert primary into proxysql_servers, add scheduler entries
    end
    Admin->>Primary: run per-table SELECT COUNT(*) checks via check_all_nodes against each port
    Entrypoint->>Admin: log ">>> ProxySQL is UP."
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed (warning)
Summary of Changes (Gemini Code Assist)

This pull request focuses on optimizing the ProxySQL cluster setup within the CI environment. By reducing the number of Docker containers and improving coverage reporting, it aims to enhance the stability and efficiency of the testing process. Several configuration adjustments and script modifications support these changes.
Pull request overview
This PR refactors the TAP CI infrastructure to reduce Docker container count and improve parallel-run stability by running the ProxySQL cluster nodes as background processes inside a single ProxySQL container, alongside several coverage/logging and group-splitting adjustments.
Changes:
- Start ProxySQL “cluster nodes” as background processes inside the primary ProxySQL container, with per-node port allocation and updated cluster env wiring.
- Split ClickHouse-related TAP tests into a dedicated `legacy-clickhouse-g1` group and adjust legacy infra usage accordingly.
- Improve coverage generation signal-to-noise by redirecting tool output to a log file and narrowing fastcov source inclusion.
Reviewed changes
Copilot reviewed 12 out of 13 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| test/infra/control/start-proxysql-isolated.bash | Main change: start multiple ProxySQL instances inside one container; adds wait/init logic for cluster nodes. |
| test/infra/control/env-isolated.bash | Updates TAP cluster node endpoints to point at proxysql:<per-node-admin-port> instead of per-container hostnames. |
| test/infra/control/check_all_nodes.bash | Updates scheduler keepalive script for single-container / localhost-per-port cluster layout. |
| test/tap/groups/default/pre-proxysql.bash | Stops doing cluster startup; now conditionally waits for cluster stabilization. |
| test/tap/groups/legacy/pre-proxysql.bash | Converts legacy group pre-hook to a no-op. |
| test/tap/groups/legacy/infras.lst | Removes ClickHouse infra from legacy group infra list. |
| test/tap/groups/legacy-clickhouse/pre-proxysql.bash | Adds a new group hook (no-op) for ClickHouse legacy group. |
| test/tap/groups/legacy-clickhouse/infras.lst | Adds infra list for the new ClickHouse legacy group. |
| test/tap/groups/legacy-clickhouse/env.sh | Adds env overrides for the new ClickHouse legacy group. |
| test/tap/groups/groups.json | Re-formats group mapping and moves ClickHouse tests onto legacy-clickhouse-g1. |
| test/infra/control/run-tests-isolated.bash | Redirects coverage tool output to coverage-generation.log and restricts included source roots. |
| test/infra/control/run-multi-group.bash | Defaults AUTO_CLEANUP=1 and redirects combined coverage tool output to coverage-generation.log. |
| test/infra/infra-mysql57/conf/proxysql/infra-config.sql | Updates/adds debug filters used in infra configuration. |
```bash
for i in $(seq 1 "${NUM_NODES}"); do
  ADMIN_PORT=$((6032 + i * 10))
  RESTAPI_PORT=$((6070 + i))
  echo ">>> Configuring proxy-node${i} (port ${ADMIN_PORT})"

  ${MYSQL_CMD} -P${ADMIN_PORT} <<SQL
UPDATE global_variables SET variable_value='false' WHERE variable_name='admin-hash_passwords';
SET admin-restapi_port=${RESTAPI_PORT};
SET admin-restapi_enabled='true';
```
In the single-container cluster, RESTAPI_PORT=$((6070 + i)) can collide with other nodes’ admin/mysql ports (e.g. node2 restapi=6072 conflicts with proxy-node4 admin=6072, node3 restapi=6073 conflicts with proxy-node4 mysql=6073). Please change the per-node REST API port assignment to a non-overlapping, deterministic range (or derive it from ADMIN_PORT with a large offset) so each node can enable RESTAPI without bind failures.
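A sketch of one non-overlapping derivation (the +1000 offset is an assumption for illustration, not the reviewer's exact proposal): with admin/mysql ports occupying 6042..6123 for nine nodes, any offset that keeps the REST API range disjoint avoids the collision:

```shell
# Derive the REST API port from ADMIN_PORT with a large offset so it cannot
# land on any node's admin (6032+i*10) or mysql (6033+i*10) port.
for i in $(seq 1 4); do
  ADMIN_PORT=$((6032 + i * 10))
  RESTAPI_PORT=$((ADMIN_PORT + 1000))   # e.g. node2 restapi=7052, never 6072/6073
  echo "proxy-node${i}: admin=${ADMIN_PORT} restapi=${RESTAPI_PORT}"
done
```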
| interfaces=\"0.0.0.0:\${MYSQL_PORT}\" | ||
| } | ||
| NODECNF | ||
|
|
All ProxySQL processes in this container inherit GCOV_PREFIX=/gcov, so the background cluster nodes will write .gcda files into the same paths as the primary process. That can corrupt coverage output due to concurrent writes and make reports flaky. Consider unsetting GCOV env vars for the background nodes, or set a unique GCOV_PREFIX per node (e.g., /gcov/node${i}) and merge the results during coverage collection.
```bash
# Avoid concurrent coverage writes from background nodes: disable gcov env vars here.
unset GCOV_PREFIX GCOV_PREFIX_STRIP
```
```bash
# Build proxysql_servers entries: primary + first 3 nodes as core
PROXYSQL_SERVERS_SQL="DELETE FROM proxysql_servers;"
for i in $(seq 1 3); do
```
PROXYSQL_SERVERS_SQL is always built for core nodes 1..3, even when NUM_NODES is configured smaller. With PROXYSQL_CLUSTER_NODES=1/2, this will insert non-existent peers into proxysql_servers and can break/slow cluster sync. Please cap the core-node loop to min(NUM_NODES, 3) (or gate it) so the script behaves correctly for smaller clusters too.
Suggested change:

```diff
-# Build proxysql_servers entries: primary + first 3 nodes as core
-PROXYSQL_SERVERS_SQL="DELETE FROM proxysql_servers;"
-for i in $(seq 1 3); do
+# Build proxysql_servers entries: primary + up to first 3 nodes as core
+PROXYSQL_SERVERS_SQL="DELETE FROM proxysql_servers;"
+CORE_NODES=3
+if [ "${NUM_NODES}" -lt 3 ]; then
+  CORE_NODES="${NUM_NODES}"
+fi
+for i in $(seq 1 "${CORE_NODES}"); do
```
```bash
# Install on core nodes (1-3)
for i in $(seq 1 3); do
  ADMIN_PORT=$((6032 + i * 10))
  ${MYSQL_CMD} -P${ADMIN_PORT} <<SQL
INSERT OR REPLACE INTO scheduler (interval_ms, filename) VALUES (12000, '/tmp/check_all_nodes.bash');
LOAD SCHEDULER TO RUNTIME;
SAVE SCHEDULER TO DISK;
SQL
```
The scheduler installation loop is hard-coded to seq 1 3 for “core nodes”. If NUM_NODES is configured smaller than 3, this will try to configure scheduler entries on admin ports that don’t exist, causing avoidable failures/timeouts. Please cap this loop similarly to min(NUM_NODES, 3) (or skip when the node wasn’t started).
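A minimal sketch of the suggested cap, assuming the script's NUM_NODES variable and port scheme (the example value is illustrative):

```shell
# Cap the scheduler-install loop at min(NUM_NODES, 3) so it never targets
# admin ports of core nodes that were not started.
NUM_NODES=2   # example; normally derived from PROXYSQL_CLUSTER_NODES
CORE_NODES=$(( NUM_NODES < 3 ? NUM_NODES : 3 ))
for i in $(seq 1 "${CORE_NODES}"); do
  ADMIN_PORT=$((6032 + i * 10))
  echo "would install scheduler on admin port ${ADMIN_PORT}"
done
```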
```diff
 # Cluster Nodes — all run inside the ProxySQL container on different ports
 # Port scheme: proxy-node1=6042, proxy-node2=6052, ..., proxy-node9=6122
 # From the test-runner container, reach them via the proxysql hostname
 CLUSTER_NODES=""
 for i in $(seq 1 9); do
-  CLUSTER_NODES="${CLUSTER_NODES}proxy-node${i}:6032,"
+  PORT=$((6032 + i * 10))
+  CLUSTER_NODES="${CLUSTER_NODES}proxysql:${PORT},"
 done
```
TAP_CLUSTER_NODES is still built with seq 1 9 regardless of PROXYSQL_CLUSTER_NODES / SKIP_CLUSTER_START. Since the cluster size is now configurable, this can make tests attempt to connect to nodes that weren’t started (or miss nodes when >9). Please derive the loop bound from PROXYSQL_CLUSTER_NODES (and produce an empty list when cluster start is skipped).
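One way to implement this, sketched under the assumption that PROXYSQL_CLUSTER_NODES and SKIP_CLUSTER_START are the variables the surrounding scripts already use (values below are examples):

```shell
# Derive TAP_CLUSTER_NODES from the configured cluster size instead of a
# hard-coded 9, and produce an empty list when cluster start is skipped.
PROXYSQL_CLUSTER_NODES=3   # example value
SKIP_CLUSTER_START=0       # example value
CLUSTER_NODES=""
if [ "${SKIP_CLUSTER_START}" -eq 0 ]; then
  for i in $(seq 1 "${PROXYSQL_CLUSTER_NODES}"); do
    PORT=$((6032 + i * 10))
    CLUSTER_NODES="${CLUSTER_NODES}proxysql:${PORT},"
  done
fi
echo "TAP_CLUSTER_NODES=${CLUSTER_NODES%,}"
```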
Actionable comments posted: 3
🧹 Nitpick comments (7)
test/infra/control/run-tests-isolated.bash (1)
274-286: Minor: Progress message goes only to log file. Line 275 redirects the "Preparing coverage data directory..." message to the log file, so users won't see it during execution. This aligns with the PR objective to reduce stdout noise. Consider whether the "Running fastcov on /gcov..." message on line 282 should also go to the log for consistency, or keep it on stdout as a key progress indicator.
test/infra/control/check_all_nodes.bash (2)
16-19: Quote array expansions to avoid word splitting. While the current table names don't contain special characters, quoting array expansions prevents future issues if table names with spaces or glob characters are added.
Proposed fix:

```diff
 ALL_TABLES=()
-for i in ${!TABLES[@]}; do
-  ALL_TABLES+=(${TABLES[$i]})
-  ALL_TABLES+=("runtime_"${TABLES[$i]})
+for i in "${!TABLES[@]}"; do
+  ALL_TABLES+=("${TABLES[$i]}")
+  ALL_TABLES+=("runtime_${TABLES[$i]}")
 done
```
24-28: Quote the inner loop array expansion. Similar to the previous loop, quote `${!ALL_TABLES[@]}` for robustness.

Proposed fix:

```diff
 for port in ${PORTS}; do
-  for i in ${!ALL_TABLES[@]}; do
+  for i in "${!ALL_TABLES[@]}"; do
     echo "SELECT COUNT(*) FROM ${ALL_TABLES[$i]};"
   done | mysql -u admin -padmin -h 127.0.0.1 -P ${port} > /dev/null 2>&1 &
 done
```
test/infra/control/start-proxysql-isolated.bash (3)
132-142: Use `[[` instead of `[` for conditional tests in the node wait loop. Same issue as the primary wait loop; use `[[` for consistency and safety.

Proposed fix:

```diff
-  while [ $COUNT -lt $MAX_WAIT ]; do
+  while [[ $COUNT -lt $MAX_WAIT ]]; do
     if docker exec "${PROXY_CONTAINER}" mysql -uadmin -padmin -h127.0.0.1 -P${ADMIN_PORT} -e 'SELECT 1' >/dev/null 2>&1; then
       echo " OK."
       break
     fi
     echo -n "."
     sleep 1
     COUNT=$((COUNT+1))
   done
-  if [ $COUNT -ge $MAX_WAIT ]; then echo " TIMEOUT (node ${i})"; exit 1; fi
+  if [[ $COUNT -ge $MAX_WAIT ]]; then echo " TIMEOUT (node ${i})"; exit 1; fi
```
112-125: Use `[[` instead of `[` for conditional tests. Per SonarCloud analysis, `[[` is safer and more feature-rich in bash scripts (handles word splitting and globbing automatically).

Proposed fix:

```diff
-while [ $COUNT -lt $MAX_WAIT ]; do
+while [[ $COUNT -lt $MAX_WAIT ]]; do
   if docker exec "${PROXY_CONTAINER}" mysql -uadmin -padmin -h127.0.0.1 -P6032 -e 'SELECT 1' >/dev/null 2>&1; then
     docker exec "${PROXY_CONTAINER}" mysql -uadmin -padmin -h127.0.0.1 -P6032 -e "
       SET clickhouse-mysql_ifaces='0.0.0.0:8000';
       LOAD CLICKHOUSE VARIABLES TO RUNTIME;
     " >/dev/null 2>&1 || true
     echo " Ready."
     break
   fi
   echo -n "."
   sleep 1
   COUNT=$((COUNT+1))
 done
-if [ $COUNT -ge $MAX_WAIT ]; then echo " TIMEOUT"; exit 1; fi
+if [[ $COUNT -ge $MAX_WAIT ]]; then echo " TIMEOUT"; exit 1; fi
```
144-145: Use `[[` for the numeric comparison.

Proposed fix:

```diff
-if [ "${NUM_NODES}" -gt 0 ]; then
+if [[ "${NUM_NODES}" -gt 0 ]]; then
```
test/tap/groups/legacy-clickhouse/env.sh (1)
1-3: Add a shebang for shell identification. Although this file is sourced rather than executed directly, adding a shebang helps static analysis tools and clarifies the intended shell dialect.

Proposed fix:

```diff
+#!/bin/bash
 # Legacy ClickHouse Test Group Environment
 export REGULAR_INFRA_DATADIR="/var/lib/proxysql"
```
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 72fe5b8a-ca39-49cd-a6fa-b4b13dc87a3c
📒 Files selected for processing (13)

- test/infra/control/check_all_nodes.bash
- test/infra/control/env-isolated.bash
- test/infra/control/run-multi-group.bash
- test/infra/control/run-tests-isolated.bash
- test/infra/control/start-proxysql-isolated.bash
- test/infra/infra-mysql57/conf/proxysql/infra-config.sql
- test/tap/groups/default/pre-proxysql.bash
- test/tap/groups/groups.json
- test/tap/groups/legacy-clickhouse/env.sh
- test/tap/groups/legacy-clickhouse/infras.lst
- test/tap/groups/legacy-clickhouse/pre-proxysql.bash
- test/tap/groups/legacy/infras.lst
- test/tap/groups/legacy/pre-proxysql.bash
💤 Files with no reviewable changes (1)
- test/tap/groups/legacy/infras.lst
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
- GitHub Check: CI-builds / builds (ubuntu24,-tap-genai-gcov)
- GitHub Check: Agent
- GitHub Check: CI-builds / builds (ubuntu22,-tap)
- GitHub Check: CI-builds / builds (debian12,-dbg)
- GitHub Check: run / trigger
- GitHub Check: claude-review
🧰 Additional context used
🪛 GitHub Check: SonarCloud Code Analysis
test/infra/control/start-proxysql-isolated.bash
[failure] 141-141: Use '[[' instead of '[' for conditional tests. The '[[' construct is safer and more feature-rich.
[failure] 125-125: Use '[[' instead of '[' for conditional tests. The '[[' construct is safer and more feature-rich.
[failure] 145-145: Use '[[' instead of '[' for conditional tests. The '[[' construct is safer and more feature-rich.
[failure] 132-132: Use '[[' instead of '[' for conditional tests. The '[[' construct is safer and more feature-rich.
🪛 Shellcheck (0.11.0)
test/tap/groups/legacy-clickhouse/env.sh
[error] 1-1: Tips depend on target shell and yours is unknown. Add a shebang or a 'shell' directive.
(SC2148)
test/infra/control/check_all_nodes.bash
[error] 16-16: Double quote array expansions to avoid re-splitting elements.
(SC2068)
[warning] 17-17: Quote to prevent word splitting/globbing, or split robustly with mapfile or read -a.
(SC2206)
[warning] 18-18: Quote to prevent word splitting/globbing, or split robustly with mapfile or read -a.
(SC2206)
[error] 25-25: Double quote array expansions to avoid re-splitting elements.
(SC2068)
🔇 Additional comments (13)
test/tap/groups/legacy-clickhouse/infras.lst (1)
1-1: LGTM! Infrastructure reference is valid and properly configured.The addition of
infra-clickhouse23is correct. The infrastructure directory exists with all necessary files (docker-compose.yml, init scripts, configuration), and this change aligns with the PR's goal of reorganizing ClickHouse tests into a dedicated group.test/infra/infra-mysql57/conf/proxysql/infra-config.sql (2)
97-98: Handler name updates are correctly aligned with concrete session entry points.These renamed filters match the actual
__get_connectionhandler symbols, so this should restore accurate trace capture for that DSS transition path.
107-113: Added debug filters are well-scoped for monitor/connection-path diagnostics.The new entries target prepared statement continuations and monitor/data-stream connection flow, which should improve infra flake triage without overly broad filtering.
test/infra/control/run-multi-group.bash (4)
14-14: LGTM: Default AUTO_CLEANUP=1 aligns with CI best practices. Defaulting to auto-cleanup for successful groups helps prevent disk bloat in CI environments while preserving logs for failed groups that need debugging.
Also applies to: 38-38
357-366: LGTM: Coverage log setup is properly structured. The COVERAGE_LOG variable is correctly defined and exported to the Docker container environment.
368-401: LGTM: Coverage tool output redirection is consistent. The fastcov/lcov output is appropriately redirected to the log file while keeping the final user-facing warning on stdout to inform about failures.
411-424: LGTM: HTML report generation logging is consistent with other coverage tooling. The genhtml output redirection follows the same pattern established for fastcov and lcov.
test/infra/control/run-tests-isolated.bash (2)
267-269: LGTM: Local coverage log variable setup. The coverage_log variable is properly scoped and quoted.
297-302: LGTM: HTML generation logging is consistent with the coverage log pattern. The genhtml output is properly redirected, and the warning message correctly references the log file for debugging.
test/infra/control/env-isolated.bash (1)
62-70: LGTM! The port scheme calculation (6032 + i * 10) correctly generates admin ports 6042–6122 for nodes 1–9, and the proxysql:${PORT} format aligns with the single-container architecture where all nodes are reachable on the same hostname via different ports.
test/tap/groups/legacy-clickhouse/pre-proxysql.bash (1)
1-3: LGTM! The no-op hook correctly delegates cluster startup to start-proxysql-isolated.bash, consistent with the other group hooks.
test/tap/groups/legacy/pre-proxysql.bash (1)
1-3: LGTM! The hook is now a no-op, consistent with the centralized cluster startup in start-proxysql-isolated.bash.
test/tap/groups/default/pre-proxysql.bash (1)
12-19: LGTM! The cluster detection logic correctly mirrors start-proxysql-isolated.bash, and the 10-second stabilization wait is appropriate for allowing the cluster nodes to fully initialize before tests begin.
Code Review
This pull request refactors the ProxySQL cluster setup for CI to run all nodes within a single Docker container, which is a great improvement for reducing resource contention and improving test stability. The changes also include redirecting coverage generation logs to files and filtering coverage sources. My review focuses on improving the robustness and readability of the new bash scripts. I've identified a potential bug where a hardcoded loop count could cause failures if fewer than 3 cluster nodes are used. I've also suggested replacing a fixed sleep with a more reliable polling mechanism to wait for cluster stabilization, and some minor improvements for script idiomaticity and efficiency.
# Build proxysql_servers entries: primary + first 3 nodes as core
PROXYSQL_SERVERS_SQL="DELETE FROM proxysql_servers;"
for i in $(seq 1 3); do
The loop hardcodes 3 as the number of core nodes. If NUM_NODES is less than 3, this script will fail later at lines 206-213 when it tries to connect to non-existent nodes. The number of core nodes should be the minimum of 3 and NUM_NODES.
A similar change is needed for the loop at line 206.
-for i in $(seq 1 3); do
+for i in $(seq 1 $(( NUM_NODES < 3 ? NUM_NODES : 3 )) ); do
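The arithmetic ternary used in the suggestion is standard Bash; a tiny standalone sketch (illustrative only, not code from the PR) shows it behaves as a min(NUM_NODES, 3) helper:

```shell
#!/usr/bin/env bash
# Sketch: bash arithmetic ternary acting as min(NUM_NODES, 3).
for n in 0 1 2 3 9; do
  echo "NUM_NODES=$n -> core_nodes=$(( n < 3 ? n : 3 ))"
done
# -> core_nodes is 0, 1, 2, 3, 3 respectively
```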
for i in ${!TABLES[@]}; do
  ALL_TABLES+=(${TABLES[$i]})
  ALL_TABLES+=("runtime_"${TABLES[$i]})
done
for port in ${PORTS}; do
  for i in ${!ALL_TABLES[@]}; do
    echo "SELECT COUNT(*) FROM ${ALL_TABLES[$i]};"
  done | mysql -u admin -padmin -h 127.0.0.1 -P ${port} > /dev/null 2>&1 &
done
The inner loop generating SQL queries can be made more efficient and readable by using printf to generate all queries at once, avoiding the shell loop. Also, it's good practice to quote shell variables like ${port} when they are used as arguments.
for port in ${PORTS}; do
printf 'SELECT COUNT(*) FROM %s;\n' "${ALL_TABLES[@]}" | mysql -u admin -padmin -h 127.0.0.1 -P "${port}" > /dev/null 2>&1 &
done
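The printf form works because the format string is reapplied for every remaining argument, so one command replaces the whole echo loop. A self-contained sketch (the table names here are just examples):

```shell
#!/usr/bin/env bash
# Sketch: printf reuses the format string for each array element.
ALL_TABLES=(mysql_servers runtime_mysql_servers mysql_users)
printf 'SELECT COUNT(*) FROM %s;\n' "${ALL_TABLES[@]}"
# -> SELECT COUNT(*) FROM mysql_servers;
# -> SELECT COUNT(*) FROM runtime_mysql_servers;
# -> SELECT COUNT(*) FROM mysql_users;
```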
echo "[$(date '+%Y-%m-%d %H:%M:%S')] >>> Pre-proxysql: waiting for cluster to stabilize..."
sleep 10
Using a fixed sleep can lead to flaky tests. It's better to replace it with a polling mechanism that actively checks if the cluster has stabilized. This can be done by querying the proxysql_servers table on the primary node until the expected number of cluster nodes are ONLINE.
echo "[$(date '+%Y-%m-%d %H:%M:%S')] >>> Pre-proxysql: waiting for cluster to stabilize..."
MAX_WAIT=30
COUNT=0
while [ $COUNT -lt $MAX_WAIT ]; do
# The number of nodes configured in proxysql_servers is min(3, NUM_NODES)
CORE_NODE_COUNT=$(( NUM_NODES < 3 ? NUM_NODES : 3 ))
if [ "${CORE_NODE_COUNT}" -eq 0 ]; then
break # No cluster nodes to check
fi
ONLINE_NODES=$(mysql -uradmin -pradmin -hproxysql -P6032 -NB -e "SELECT COUNT(*) FROM proxysql_servers WHERE status='ONLINE';" 2>/dev/null || echo 0)
if [ "${ONLINE_NODES}" -ge "${CORE_NODE_COUNT}" ]; then
echo " All ${CORE_NODE_COUNT} cluster nodes are ONLINE in primary's view."
break
fi
echo -n "."
sleep 1
COUNT=$((COUNT+1))
done
if [ $COUNT -ge $MAX_WAIT ]; then echo " TIMEOUT waiting for cluster to stabilize"; exit 1; fi
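The same poll-with-timeout pattern can be factored into a small reusable helper. This is a hypothetical sketch, not code from the PR; wait_for and the check command passed to it are illustrative names:

```shell
#!/usr/bin/env bash
# Sketch: generic poll-until helper. Runs the given command once per
# second until it succeeds or max_tries is exhausted.
wait_for() {
  local max_tries=$1; shift
  local i=0
  while (( i < max_tries )); do
    if "$@"; then
      return 0
    fi
    sleep 1
    i=$((i + 1))
  done
  return 1
}

# Example usage (a real caller would pass a cluster health check):
wait_for 3 true && echo "condition met"
```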
- RESTAPI port offset changed from 6070+i to 7070+i to avoid collision with admin/mysql ports (6072 would collide with node4 admin)
- Unset GCOV_PREFIX for background cluster nodes to prevent concurrent .gcda write corruption
- Cap core-node and scheduler loops to min(NUM_NODES, 3) so smaller clusters don't reference non-existent nodes
- TAP_CLUSTER_NODES now respects PROXYSQL_CLUSTER_NODES and SKIP_CLUSTER_START instead of hardcoding 9
- Fix fastcov flag: -i is invalid, use --include for source filtering
TAP tests running in the test-runner container query proxysql_servers to find cluster nodes, then connect to them directly. With 127.0.0.1, the test-runner connects to itself instead of the ProxySQL container. Using 'proxysql' (the container hostname) works from both inside the container (resolves via /etc/hosts) and from the test-runner (resolves via Docker DNS on the shared network).
All noise thread failure messages now include host, port, user, and the actual error message. Previously they only logged generic "connection failure" with no details, making it impossible to diagnose why a noise thread failed.
Actionable comments posted: 3
♻️ Duplicate comments (1)
test/infra/control/start-proxysql-isolated.bash (1)
87-89: ⚠️ Potential issue | 🟡 Minor: Avoid piping the final exec.
exec on the left side of a pipeline only replaces the pipeline child, so ProxySQL still is not PID 1 here, and its exit status is still hidden behind tee. Make ProxySQL the direct exec target:
-exec /usr/bin/proxysql --idle-threads --clickhouse-server --sqlite3-server -f -c /etc/proxysql.cnf -D /var/lib/proxysql 2>&1 | tee /var/lib/proxysql/proxysql.log
+exec /usr/bin/proxysql --idle-threads --clickhouse-server --sqlite3-server -f -c /etc/proxysql.cnf -D /var/lib/proxysql > >(tee /var/lib/proxysql/proxysql.log) 2>&1
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/infra/control/start-proxysql-isolated.bash` around lines 87 - 89, The current line uses exec on the left side of a pipeline so ProxySQL is not PID 1 and its exit status is obscured; change the invocation so /usr/bin/proxysql is exec'd directly and its stdout/stderr are tee'd via process substitution (e.g. use exec /usr/bin/proxysql ... -D /var/lib/proxysql > >(tee /var/lib/proxysql/proxysql.log) 2>&1) so ProxySQL becomes PID 1 and logs still go to /var/lib/proxysql/proxysql.log; also remove the stray trailing double-quote from the command.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@test/infra/control/start-proxysql-isolated.bash`:
- Around line 24-28: Validate the PROXYSQL_CLUSTER_NODES env before building the
inner script: ensure you read PROXYSQL_CLUSTER_NODES into NUM_NODES, but then
validate it matches a single digit 0..9 (e.g., regex '^[0-9]$') and reject/exit
with a clear error if not; keep the existing override for SKIP_CLUSTER_START
(when SKIP_CLUSTER_START == "1" or "true" set NUM_NODES=0) and ensure the
validation runs after that override so an explicit skip still yields 0, and
perform no further interpolation if validation fails to prevent out-of-range
values (e.g., 10+) from being injected into the /bin/bash -c payload.
- Around line 57-58: The global unset of GCOV_PREFIX and GCOV_PREFIX_STRIP
removes coverage env vars for the foreground ProxySQL too; revert the
unconditional "unset GCOV_PREFIX GCOV_PREFIX_STRIP" and instead remove those
variables only for background cluster node launches by using env -u when
invoking the background start command (the invocation around the current env -u
suggestion at the location where background nodes are started, referenced in the
review as the command at line ~83), leaving the primary ProxySQL start (the
process started where the script launches the foreground ProxySQL around line
~88) to inherit GCOV_PREFIX/GCOV_PREFIX_STRIP so it can write coverage to /gcov.
- Line 128: Replace POSIX test brackets with Bash conditional brackets for the
numeric comparisons using the COUNT and MAX_WAIT variables: change occurrences
like "if [ $COUNT -ge $MAX_WAIT ]", "while [ $COUNT -lt $MAX_WAIT ]" and the
other similar tests to use "[[ $COUNT -ge $MAX_WAIT ]]" / "[[ $COUNT -lt
$MAX_WAIT ]]" respectively; keep proper spacing inside the brackets and ensure
all listed occurrences (the if at the shown diff plus the while and other tests
mentioned) are updated for SonarCloud compliance.
---
Duplicate comments:
In `@test/infra/control/start-proxysql-isolated.bash`:
- Around line 87-89: The current line uses exec on the left side of a pipeline
so ProxySQL is not PID 1 and its exit status is obscured; change the invocation
so /usr/bin/proxysql is exec'd directly and its stdout/stderr are tee'd via
process substitution (e.g. use exec /usr/bin/proxysql ... -D /var/lib/proxysql >
>(tee /var/lib/proxysql/proxysql.log) 2>&1) so ProxySQL becomes PID 1 and logs
still go to /var/lib/proxysql/proxysql.log; also remove the stray trailing
double-quote from the command.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: aec70347-e0be-4373-a027-1d5339438577
📒 Files selected for processing (3)
test/infra/control/env-isolated.bash
test/infra/control/run-tests-isolated.bash
test/infra/control/start-proxysql-isolated.bash
🚧 Files skipped from review as they are similar to previous changes (2)
- test/infra/control/run-tests-isolated.bash
- test/infra/control/env-isolated.bash
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: CI-builds / builds (debian12,-dbg)
- GitHub Check: CI-builds / builds (ubuntu22,-tap)
- GitHub Check: CI-builds / builds (ubuntu24,-tap-genai-gcov)
- GitHub Check: run / trigger
- GitHub Check: claude-review
🧰 Additional context used
🧠 Learnings (3)
📓 Common learnings
Learnt from: yuji-hatakeyama
Repo: sysown/proxysql PR: 5307
File: test/tap/tests/reg_test_5306-show_warnings_with_comment-t.cpp:39-48
Timestamp: 2026-01-20T09:34:27.165Z
Learning: In ProxySQL test files (test/tap/tests/), resource leaks (such as not calling `mysql_close()` on early return paths) are not typically fixed because test processes are short-lived and the OS frees resources on process exit. This is a common pattern across the test suite.
📚 Learning: 2026-01-20T09:34:27.165Z
Learnt from: yuji-hatakeyama
Repo: sysown/proxysql PR: 5307
File: test/tap/tests/reg_test_5306-show_warnings_with_comment-t.cpp:39-48
Timestamp: 2026-01-20T09:34:27.165Z
Learning: In ProxySQL test files (test/tap/tests/), resource leaks (such as not calling `mysql_close()` on early return paths) are not typically fixed because test processes are short-lived and the OS frees resources on process exit. This is a common pattern across the test suite.
Applied to files:
test/infra/control/start-proxysql-isolated.bash
📚 Learning: 2026-03-22T14:38:16.082Z
Learnt from: CR
Repo: sysown/proxysql PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-22T14:38:16.082Z
Learning: Feature tiers are controlled by build flags: PROXYSQL31=1 for v3.1.x, PROXYSQLGENAI=1 for v4.0.x; PROXYSQLGENAI=1 implies PROXYSQL31=1
Applied to files:
test/infra/control/start-proxysql-isolated.bash
🪛 GitHub Check: SonarCloud Code Analysis
test/infra/control/start-proxysql-isolated.bash
[failure] 155-155: Use '[[' instead of '[' for conditional tests. The '[[' construct is safer and more feature-rich.
[failure] 128-128: Use '[[' instead of '[' for conditional tests. The '[[' construct is safer and more feature-rich.
[failure] 144-144: Use '[[' instead of '[' for conditional tests. The '[[' construct is safer and more feature-rich.
[failure] 148-148: Use '[[' instead of '[' for conditional tests. The '[[' construct is safer and more feature-rich.
[failure] 135-135: Use '[[' instead of '[' for conditional tests. The '[[' construct is safer and more feature-rich.
# Cluster configuration
NUM_NODES=${PROXYSQL_CLUSTER_NODES:-9}
if [[ "${SKIP_CLUSTER_START}" == "1" ]] || [[ "${SKIP_CLUSTER_START}" == "true" ]]; then
  NUM_NODES=0
fi
Validate PROXYSQL_CLUSTER_NODES before building the inner script.
This value is now env-driven and gets interpolated into the /bin/bash -c payload verbatim. It also has a hard upper bound here: NUM_NODES=10 makes MYSQL_PORT=6133, which collides with the primary pgsql port documented on Line 50. Please reject anything except integers 0..9 up front.
Suggested guard
NUM_NODES=${PROXYSQL_CLUSTER_NODES:-9}
if [[ "${SKIP_CLUSTER_START}" == "1" ]] || [[ "${SKIP_CLUSTER_START}" == "true" ]]; then
NUM_NODES=0
fi
+if ! [[ "${NUM_NODES}" =~ ^[0-9]+$ ]] || [[ "${NUM_NODES}" -gt 9 ]]; then
+ echo "Error: PROXYSQL_CLUSTER_NODES must be an integer between 0 and 9."
+ exit 1
+fi
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
# Cluster configuration
NUM_NODES=${PROXYSQL_CLUSTER_NODES:-9}
if [[ "${SKIP_CLUSTER_START}" == "1" ]] || [[ "${SKIP_CLUSTER_START}" == "true" ]]; then
  NUM_NODES=0
fi
if ! [[ "${NUM_NODES}" =~ ^[0-9]+$ ]] || [[ "${NUM_NODES}" -gt 9 ]]; then
  echo "Error: PROXYSQL_CLUSTER_NODES must be an integer between 0 and 9."
  exit 1
fi
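A standalone version of the guard can be exercised directly; validate_nodes below is a hypothetical wrapper around the suggested check, not code from the PR:

```shell
#!/usr/bin/env bash
# Sketch: the suggested range/format guard wrapped in a function for testing.
validate_nodes() {
  local n=$1
  # Reject anything that is not a non-negative integer, or is greater than 9.
  if ! [[ "$n" =~ ^[0-9]+$ ]] || [[ "$n" -gt 9 ]]; then
    echo "reject"
  else
    echo "ok"
  fi
}
validate_nodes 5    # -> ok
validate_nodes 10   # -> reject (would collide with primary pgsql ports)
validate_nodes abc  # -> reject (not an integer)
```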
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@test/infra/control/start-proxysql-isolated.bash` around lines 24 - 28,
Validate the PROXYSQL_CLUSTER_NODES env before building the inner script: ensure
you read PROXYSQL_CLUSTER_NODES into NUM_NODES, but then validate it matches a
single digit 0..9 (e.g., regex '^[0-9]$') and reject/exit with a clear error if
not; keep the existing override for SKIP_CLUSTER_START (when SKIP_CLUSTER_START
== "1" or "true" set NUM_NODES=0) and ensure the validation runs after that
override so an explicit skip still yields 0, and perform no further
interpolation if validation fails to prevent out-of-range values (e.g., 10+)
from being injected into the /bin/bash -c payload.
# Disable gcov for background cluster nodes to avoid concurrent .gcda writes
unset GCOV_PREFIX GCOV_PREFIX_STRIP
🧩 Analysis chain
🏁 Script executed:
cat -n test/infra/control/start-proxysql-isolated.bash | head -100
Repository: sysown/proxysql
Length of output: 4198
Keep GCOV_PREFIX* for the foreground ProxySQL.
The unset at lines 57–58 is shell-wide within the startup script, so the primary ProxySQL process started at line 88 inherits no GCOV_PREFIX/GCOV_PREFIX_STRIP, disabling coverage collection for the instance that should write data to /gcov.
Limit the unset to background cluster nodes only by using env -u at line 83 instead:
Suggested change
-# Disable gcov for background cluster nodes to avoid concurrent .gcda writes
-unset GCOV_PREFIX GCOV_PREFIX_STRIP
-
# Start cluster nodes as background processes
for i in $(seq 1 ${NUM_NODES}); do
ADMIN_PORT=$((6032 + i * 10))
@@ -80,1 +79,2 @@
- /usr/bin/proxysql --idle-threads -f -c ${NODE_DIR}/proxysql-node.cnf -D ${NODE_DIR} >> ${NODE_DIR}/proxysql.log 2>&1 &
+ env -u GCOV_PREFIX -u GCOV_PREFIX_STRIP \
+ /usr/bin/proxysql --idle-threads -f -c ${NODE_DIR}/proxysql-node.cnf -D ${NODE_DIR} >> ${NODE_DIR}/proxysql.log 2>&1 &🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@test/infra/control/start-proxysql-isolated.bash` around lines 57 - 58, The
global unset of GCOV_PREFIX and GCOV_PREFIX_STRIP removes coverage env vars for
the foreground ProxySQL too; revert the unconditional "unset GCOV_PREFIX
GCOV_PREFIX_STRIP" and instead remove those variables only for background
cluster node launches by using env -u when invoking the background start command
(the invocation around the current env -u suggestion at the location where
background nodes are started, referenced in the review as the command at line
~83), leaving the primary ProxySQL start (the process started where the script
launches the foreground ProxySQL around line ~88) to inherit
GCOV_PREFIX/GCOV_PREFIX_STRIP so it can write coverage to /gcov.
sleep 1
COUNT=$((COUNT+1))
done
if [ $COUNT -ge $MAX_WAIT ]; then echo " TIMEOUT"; exit 1; fi
🧩 Analysis chain
🏁 Script executed:
# Check if the file exists and read the relevant lines
if [ -f "test/infra/control/start-proxysql-isolated.bash" ]; then
echo "=== File found, reading content ==="
wc -l "test/infra/control/start-proxysql-isolated.bash"
echo ""
echo "=== Lines 128, 135, 144, 148, 155 ==="
sed -n '128p; 135p; 144p; 148p; 155p' "test/infra/control/start-proxysql-isolated.bash"
echo ""
echo "=== Context around line 128 ==="
sed -n '125,160p' "test/infra/control/start-proxysql-isolated.bash"
else
echo "File not found"
fi
Repository: sysown/proxysql
Length of output: 1846
🏁 Script executed:
# Search for all [ conditionals in the file to check if there are others
echo "=== All [ conditional patterns in the file ==="
rg '\[\s' "test/infra/control/start-proxysql-isolated.bash" -n
Repository: sysown/proxysql
Length of output: 698
Switch the [ tests to [[ for unquoted numeric variable comparisons to pass SonarCloud linting.
The identified changes at lines 128, 135, 144, 148, and 155 are accurate. Using [[ ... ]] instead of [ ... ] with unquoted numeric variables ($COUNT) is both SonarCloud-compliant and idiomatic Bash, with no behavioral change.
Note: Line 115 contains the same pattern (while [ $COUNT -lt $MAX_WAIT ]) and should also be converted to [[ ... ]] for consistency.
Sonar fix
-while [ $COUNT -lt $MAX_WAIT ]; do
+while [[ $COUNT -lt $MAX_WAIT ]]; do
@@
-if [ $COUNT -ge $MAX_WAIT ]; then echo " TIMEOUT"; exit 1; fi
+if [[ $COUNT -ge $MAX_WAIT ]]; then echo " TIMEOUT"; exit 1; fi
@@
- while [ $COUNT -lt $MAX_WAIT ]; do
+ while [[ $COUNT -lt $MAX_WAIT ]]; do
@@
- if [ $COUNT -ge $MAX_WAIT ]; then echo " TIMEOUT (node ${i})"; exit 1; fi
+ if [[ $COUNT -ge $MAX_WAIT ]]; then echo " TIMEOUT (node ${i})"; exit 1; fi
@@
-if [ "${NUM_NODES}" -gt 0 ]; then
+if [[ "${NUM_NODES}" -gt 0 ]]; then
@@
- if [ "${NUM_NODES}" -lt 3 ]; then CORE_NODES="${NUM_NODES}"; fi
+ if [[ "${NUM_NODES}" -lt 3 ]]; then CORE_NODES="${NUM_NODES}"; fi🧰 Tools
🪛 GitHub Check: SonarCloud Code Analysis
[failure] 128-128: Use '[[' instead of '[' for conditional tests. The '[[' construct is safer and more feature-rich.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@test/infra/control/start-proxysql-isolated.bash` at line 128, Replace POSIX
test brackets with Bash conditional brackets for the numeric comparisons using
the COUNT and MAX_WAIT variables: change occurrences like "if [ $COUNT -ge
$MAX_WAIT ]", "while [ $COUNT -lt $MAX_WAIT ]" and the other similar tests to
use "[[ $COUNT -ge $MAX_WAIT ]]" / "[[ $COUNT -lt $MAX_WAIT ]]" respectively;
keep proper spacing inside the brackets and ensure all listed occurrences (the
if at the shown diff plus the while and other tests mentioned) are updated for
SonarCloud compliance.
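Beyond the linter rule, there is a concrete behavioral difference worth noting: inside [[ ... ]], an empty or unset operand of -lt is evaluated arithmetically as 0, while the single-bracket test sees a missing operand and fails with a usage error. A small illustrative sketch (not code from the PR):

```shell
#!/usr/bin/env bash
# Sketch: [ vs [[ with an unset numeric variable.
unset COUNT
MAX_WAIT=5

# POSIX test: $COUNT expands to nothing, leaving '[ -lt 5 ]' -> usage error.
[ $COUNT -lt $MAX_WAIT ] 2>/dev/null
single=$?

# Bash conditional: the empty operand is evaluated arithmetically as 0 -> true.
[[ $COUNT -lt $MAX_WAIT ]]
double=$?

echo "single-bracket exit=$single (non-zero), double-bracket exit=$double"
```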
Test now prints a description of what it does and numbered steps:
1. Connect to admin
2. Disable cluster scheduler
3. Measure idle CPU baseline
4. Trigger BUSY_LOOP (disconnect during query)
5. Measure CPU after BUSY_LOOP
6. Wait for cleanup
7. Verify CPU returned to idle
8. Trigger INF_LOOP (disconnect after response)
9. Measure CPU after INF_LOOP
10. Recover cluster scheduler
Cluster nodes only configured mysql admin and data ports, leaving
pgsql (6133/6132) and sqlite (6030) at defaults — same as the
primary. All nodes tried to bind the same ports, causing:
1. Bind failures for pgsql on cluster nodes
2. Cluster sync overwriting primary's pgsql config with empty tables
(nodes had no working pgsql, so their pgsql tables were empty)
Now each node gets unique ports for all interfaces:
node N: mysql admin=6032+N*10, mysql data=6033+N*10,
pgsql data=6133+N*10, pgsql admin=6132+N*10,
sqlite=6030+N*10
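Under that scheme, each node's full port set can be derived with one helper. This sketch (with illustrative function and label names) reproduces the mapping from the commit message:

```shell
#!/usr/bin/env bash
# Sketch: per-node port derivation matching the commit message
# (node N: mysql admin=6032+N*10, mysql=6033+N*10,
#  pgsql=6133+N*10, pgsql admin=6132+N*10, sqlite=6030+N*10).
node_ports() {
  local N=$1
  echo "node${N}: mysql_admin=$((6032 + N * 10)) mysql=$((6033 + N * 10))" \
       "pgsql=$((6133 + N * 10)) pgsql_admin=$((6132 + N * 10))" \
       "sqlite=$((6030 + N * 10))"
}
node_ports 1  # -> node1: mysql_admin=6042 mysql=6043 pgsql=6143 pgsql_admin=6142 sqlite=6040
node_ports 9  # -> node9: mysql_admin=6122 mysql=6123 pgsql=6223 pgsql_admin=6222 sqlite=6120
```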
♻️ Duplicate comments (3)
test/infra/control/start-proxysql-isolated.bash (3)
56-58: ⚠️ Potential issue | 🟠 Major: The global unset disables coverage for the primary ProxySQL process.
The unset GCOV_PREFIX GCOV_PREFIX_STRIP at lines 57-58 runs before the primary ProxySQL starts at line 99, so the primary instance won't write coverage data to /gcov. The intent was to disable coverage only for background cluster nodes.
Fix: use `env -u` only for background node launches. Remove lines 57-58 and modify line 94 to unset the variables only for background processes:
-# Disable gcov for background cluster nodes to avoid concurrent .gcda writes
-unset GCOV_PREFIX GCOV_PREFIX_STRIP
-
 # Start cluster nodes as background processes
 for i in \$(seq 1 ${NUM_NODES}); do
 ...
-  /usr/bin/proxysql --idle-threads -f -c \${NODE_DIR}/proxysql-node.cnf -D \${NODE_DIR} >> \${NODE_DIR}/proxysql.log 2>&1 &
+  env -u GCOV_PREFIX -u GCOV_PREFIX_STRIP /usr/bin/proxysql --idle-threads -f -c \${NODE_DIR}/proxysql-node.cnf -D \${NODE_DIR} >> \${NODE_DIR}/proxysql.log 2>&1 &
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/infra/control/start-proxysql-isolated.bash` around lines 56 - 58, The global unset of GCOV_PREFIX and GCOV_PREFIX_STRIP inside the STARTUP_CMD removes coverage for the primary ProxySQL process; remove the lines that unset these variables from STARTUP_CMD and instead apply "env -u" when spawning background cluster nodes (the commands that launch background instances), so only background node launches have GCOV_PREFIX/GCOV_PREFIX_STRIP removed while the primary process retains coverage.
24-28: ⚠️ Potential issue | 🟠 Major: Add input validation for NUM_NODES to prevent port collisions.
If PROXYSQL_CLUSTER_NODES is set to 10 or higher, the calculated ports will collide with primary pgsql ports (e.g., node 10: admin=6132 collides with pgsql admin). Non-integer values would also cause arithmetic errors.
Suggested validation:
 # Cluster configuration
 NUM_NODES=${PROXYSQL_CLUSTER_NODES:-9}
 if [[ "${SKIP_CLUSTER_START}" == "1" ]] || [[ "${SKIP_CLUSTER_START}" == "true" ]]; then
   NUM_NODES=0
 fi
+if ! [[ "${NUM_NODES}" =~ ^[0-9]$ ]]; then
+  echo "Error: PROXYSQL_CLUSTER_NODES must be an integer between 0 and 9."
+  exit 1
+fi
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/infra/control/start-proxysql-isolated.bash` around lines 24 - 28, The NUM_NODES assignment using PROXYSQL_CLUSTER_NODES lacks validation and can accept non-integers or values >=10 causing port collisions; add input validation after NUM_NODES is set (and after honoring SKIP_CLUSTER_START) to ensure NUM_NODES is an integer between 0 and 9 inclusive, and if invalid print a clear error to stderr and exit with non-zero status. Validate the value referenced by PROXYSQL_CLUSTER_NODES (and the resulting NUM_NODES) using a numeric check and range check, and handle non-integer input gracefully (error+exit) rather than letting bash arithmetic or port calculation proceed; update any error messages to reference NUM_NODES/PROXYSQL_CLUSTER_NODES and SKIP_CLUSTER_START for clarity.
98-100: ⚠️ Potential issue | 🟡 Minor: The exec ... | tee pattern prevents proper signal propagation to the ProxySQL process.
When exec is combined with a pipe, the shell creates a subshell for the pipeline, so exec replaces only the subshell rather than the main shell process. Signals sent to PID 1 may not reach ProxySQL directly.
Alternative: use process substitution to preserve exec semantics:
-exec /usr/bin/proxysql --idle-threads --clickhouse-server --sqlite3-server -f -c /etc/proxysql.cnf -D /var/lib/proxysql 2>&1 | tee /var/lib/proxysql/proxysql.log
+exec /usr/bin/proxysql --idle-threads --clickhouse-server --sqlite3-server -f -c /etc/proxysql.cnf -D /var/lib/proxysql > >(tee /var/lib/proxysql/proxysql.log) 2>&1
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/infra/control/start-proxysql-isolated.bash` around lines 98 - 100, The current line uses "exec /usr/bin/proxysql ... | tee /var/lib/proxysql/proxysql.log", which breaks signal propagation because exec is used with a pipe; change it to preserve exec semantics by redirecting ProxySQL's stdout/stderr into a process substitution instead of a pipeline (so PID 1 becomes the proxysql binary). Update the invocation that references /usr/bin/proxysql and /var/lib/proxysql/proxysql.log to use process substitution (or an equivalent direct redirection) so signals delivered to PID 1 are received by the ProxySQL process.
🧹 Nitpick comments (3)
test/infra/control/start-proxysql-isolated.bash (3)
126-139: Switch [ to [[ for SonarCloud compliance.
SonarCloud flags the use of single brackets for conditional tests. Using [[ is safer and more feature-rich in Bash.
Suggested fix:
-while [ $COUNT -lt $MAX_WAIT ]; do
+while [[ $COUNT -lt $MAX_WAIT ]]; do
 ...
 done
-if [ $COUNT -ge $MAX_WAIT ]; then echo " TIMEOUT"; exit 1; fi
+if [[ $COUNT -ge $MAX_WAIT ]]; then echo " TIMEOUT"; exit 1; fi
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/infra/control/start-proxysql-isolated.bash` around lines 126 - 139, Replace POSIX-style test brackets with Bash extended tests: change the while and final if conditions that use [ $COUNT -lt $MAX_WAIT ] and [ $COUNT -ge $MAX_WAIT ] to use [[ $COUNT -lt $MAX_WAIT ]] and [[ $COUNT -ge $MAX_WAIT ]], and also ensure the inner if that checks the docker exec command uses [[ ... ]] form for its conditional (keeping the same command invocation with PROXY_CONTAINER, COUNT, MAX_WAIT variables and the docker exec mysql calls unchanged).
159-166: Switch [ to [[ in cluster initialization conditionals for SonarCloud compliance.
Lines 159 and 166 use single brackets which SonarCloud flags.
Suggested fix:
-if [ "${NUM_NODES}" -gt 0 ]; then
+if [[ "${NUM_NODES}" -gt 0 ]]; then
   echo ">>> Initializing ProxySQL Cluster (${NUM_NODES} nodes)..."
 ...
   CORE_NODES=3
-  if [ "${NUM_NODES}" -lt 3 ]; then CORE_NODES="${NUM_NODES}"; fi
+  if [[ "${NUM_NODES}" -lt 3 ]]; then CORE_NODES="${NUM_NODES}"; fi
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/infra/control/start-proxysql-isolated.bash` around lines 159 - 166, Replace POSIX test brackets with Bash conditional syntax to satisfy SonarCloud: update the conditionals that use NUM_NODES (the checks controlling cluster initialization and CORE_NODES assignment) from if [ "${NUM_NODES}" -gt 0 ]; then and if [ "${NUM_NODES}" -lt 3 ]; then to use [[ ... ]] (e.g., if [[ "${NUM_NODES}" -gt 0 ]]; then and if [[ "${NUM_NODES}" -lt 3 ]]; then), leaving the surrounding logic (MYSQL_CMD, CORE_NODES assignment) unchanged.
141-156: Switch [ to [[ in cluster node wait loop for SonarCloud compliance.
Lines 146 and 155 use single brackets which SonarCloud flags.
Suggested fix:
- while [ $COUNT -lt $MAX_WAIT ]; do
+ while [[ $COUNT -lt $MAX_WAIT ]]; do
 ...
 done
- if [ $COUNT -ge $MAX_WAIT ]; then echo " TIMEOUT (node ${i})"; exit 1; fi
+ if [[ $COUNT -ge $MAX_WAIT ]]; then echo " TIMEOUT (node ${i})"; exit 1; fi
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/infra/control/start-proxysql-isolated.bash` around lines 141 - 156, Replace the POSIX test brackets with Bash conditional brackets in the cluster wait loop: change the while and if tests that currently use [ $COUNT -lt $MAX_WAIT ] and [ $COUNT -ge $MAX_WAIT ] to use [[ ... ]] (e.g., [[ $COUNT -lt $MAX_WAIT ]] and [[ $COUNT -ge $MAX_WAIT ]]) for SonarCloud compliance and safer evaluation of variables like COUNT and MAX_WAIT in the loop that references NUM_NODES, ADMIN_PORT, PROXY_CONTAINER, COUNT and MAX_WAIT.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@test/infra/control/start-proxysql-isolated.bash`:
- Around line 56-58: The global unset of GCOV_PREFIX and GCOV_PREFIX_STRIP
inside the STARTUP_CMD removes coverage for the primary ProxySQL process; remove
the lines that unset these variables from STARTUP_CMD and instead apply "env -u"
when spawning background cluster nodes (the commands that launch background
instances), so only background node launches have GCOV_PREFIX/GCOV_PREFIX_STRIP
removed while the primary process retains coverage.
- Around line 24-28: The NUM_NODES assignment using PROXYSQL_CLUSTER_NODES lacks
validation and can accept non-integers or values >=10 causing port collisions;
add input validation after NUM_NODES is set (and after honoring
SKIP_CLUSTER_START) to ensure NUM_NODES is an integer between 0 and 9 inclusive,
and if invalid print a clear error to stderr and exit with non-zero status.
Validate the value referenced by PROXYSQL_CLUSTER_NODES (and the resulting
NUM_NODES) using a numeric check and range check, and handle non-integer input
gracefully (error+exit) rather than letting bash arithmetic or port calculation
proceed; update any error messages to reference NUM_NODES/PROXYSQL_CLUSTER_NODES
and SKIP_CLUSTER_START for clarity.
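A sketch of the suggested validation, assuming a default of 3 nodes (the real script's default may differ); the single-digit regex enforces both the integer check and the 0-9 range in one step:

```shell
#!/usr/bin/env bash
# Validate NUM_NODES: must be a single digit 0-9; the review above flags
# non-integers and values >= 10 as risking bad arithmetic / port collisions.
NUM_NODES="${PROXYSQL_CLUSTER_NODES:-3}"
if ! [[ "$NUM_NODES" =~ ^[0-9]$ ]]; then
    echo "ERROR: PROXYSQL_CLUSTER_NODES must be an integer 0-9 (got '${NUM_NODES}')" >&2
    exit 1
fi
echo "NUM_NODES=${NUM_NODES}"
```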
- Around line 98-100: The current line uses "exec /usr/bin/proxysql ... | tee
/var/lib/proxysql/proxysql.log", which breaks signal propagation because exec is
used with a pipe; change it to preserve exec semantics by redirecting ProxySQL's
stdout/stderr into a process substitution instead of a pipeline (so PID 1
becomes the proxysql binary). Update the invocation that references
/usr/bin/proxysql and /var/lib/proxysql/proxysql.log to use process substitution
(or an equivalent direct redirection) so signals delivered to PID 1 are received
by the ProxySQL process.
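The fix described above can be sketched as follows; `/bin/echo` stands in for the proxysql binary, and the real invocation (shown in the comment) is an assumption based on the paths named in the review:

```shell
#!/usr/bin/env bash
# Sketch: keep exec semantics (the binary replaces the shell and receives
# container signals directly) while still logging via tee. The real line
# would look roughly like:
#   exec /usr/bin/proxysql ... > >(tee /var/lib/proxysql/proxysql.log) 2>&1
LOG="$(mktemp)"
( exec /bin/echo "started" > >(tee "$LOG") 2>&1 )
sleep 1   # give the tee process substitution time to flush
grep -q "started" "$LOG" && echo "log captured"
```

With a pipeline (`exec cmd | tee`), `exec` applies only to the left side of the pipe and the shell lingers as PID 1; with process substitution, the redirection is set up first and `exec` genuinely replaces the shell.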
---
Nitpick comments:
In `@test/infra/control/start-proxysql-isolated.bash`:
- Around line 126-139: Replace POSIX-style test brackets with Bash extended
tests: change the while and final if conditions that use [ $COUNT -lt $MAX_WAIT
] and [ $COUNT -ge $MAX_WAIT ] to use [[ $COUNT -lt $MAX_WAIT ]] and [[ $COUNT
-ge $MAX_WAIT ]], and also ensure the inner if that checks the docker exec
command uses [[ ... ]] form for its conditional (keeping the same command
invocation with PROXY_CONTAINER, COUNT, MAX_WAIT variables and the docker exec
mysql calls unchanged).
- Around line 159-166: Replace POSIX test brackets with Bash conditional syntax
to satisfy SonarCloud: update the conditionals that use NUM_NODES (the checks
controlling cluster initialization and CORE_NODES assignment) from if [
"${NUM_NODES}" -gt 0 ]; then and if [ "${NUM_NODES}" -lt 3 ]; then to use [[ ...
]] (e.g., if [[ "${NUM_NODES}" -gt 0 ]]; then and if [[ "${NUM_NODES}" -lt 3 ]];
then), leaving the surrounding logic (MYSQL_CMD, CORE_NODES assignment)
unchanged.
- Around line 141-156: Replace the POSIX test brackets with Bash conditional
brackets in the cluster wait loop: change the while and if tests that currently
use [ $COUNT -lt $MAX_WAIT ] and [ $COUNT -ge $MAX_WAIT ] to use [[ ... ]]
(e.g., [[ $COUNT -lt $MAX_WAIT ]] and [[ $COUNT -ge $MAX_WAIT ]]) for SonarCloud
compliance and safer evaluation of variables like COUNT and MAX_WAIT in the loop
that references NUM_NODES, ADMIN_PORT, PROXY_CONTAINER, COUNT and MAX_WAIT.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 803d9120-c8a9-431a-91c8-a489784d68f7
📒 Files selected for processing (1)
test/infra/control/start-proxysql-isolated.bash
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: CI-builds / builds (debian12,-dbg)
- GitHub Check: CI-builds / builds (ubuntu22,-tap)
- GitHub Check: CI-builds / builds (ubuntu24,-tap-genai-gcov)
- GitHub Check: run / trigger
- GitHub Check: claude-review
🧰 Additional context used
🧠 Learnings (3)
📓 Common learnings
Learnt from: yuji-hatakeyama
Repo: sysown/proxysql PR: 5307
File: test/tap/tests/reg_test_5306-show_warnings_with_comment-t.cpp:39-48
Timestamp: 2026-01-20T09:34:27.165Z
Learning: In ProxySQL test files (test/tap/tests/), resource leaks (such as not calling `mysql_close()` on early return paths) are not typically fixed because test processes are short-lived and the OS frees resources on process exit. This is a common pattern across the test suite.
📚 Learning: 2026-01-20T09:34:27.165Z
Learnt from: yuji-hatakeyama
Repo: sysown/proxysql PR: 5307
File: test/tap/tests/reg_test_5306-show_warnings_with_comment-t.cpp:39-48
Timestamp: 2026-01-20T09:34:27.165Z
Learning: In ProxySQL test files (test/tap/tests/), resource leaks (such as not calling `mysql_close()` on early return paths) are not typically fixed because test processes are short-lived and the OS frees resources on process exit. This is a common pattern across the test suite.
Applied to files:
test/infra/control/start-proxysql-isolated.bash
📚 Learning: 2026-03-22T14:38:16.082Z
Learnt from: CR
Repo: sysown/proxysql PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-22T14:38:16.082Z
Learning: Feature tiers are controlled by build flags: PROXYSQL31=1 for v3.1.x, PROXYSQLGENAI=1 for v4.0.x; PROXYSQLGENAI=1 implies PROXYSQL31=1
Applied to files:
test/infra/control/start-proxysql-isolated.bash
🪛 GitHub Check: SonarCloud Code Analysis
test/infra/control/start-proxysql-isolated.bash
[failure] 139-139: Use '[[' instead of '[' for conditional tests. The '[[' construct is safer and more feature-rich.
[failure] 146-146: Use '[[' instead of '[' for conditional tests. The '[[' construct is safer and more feature-rich.
[failure] 155-155: Use '[[' instead of '[' for conditional tests. The '[[' construct is safer and more feature-rich.
[failure] 166-166: Use '[[' instead of '[' for conditional tests. The '[[' construct is safer and more feature-rich.
[failure] 159-159: Use '[[' instead of '[' for conditional tests. The '[[' construct is safer and more feature-rich.
🔇 Additional comments (5)
test/infra/control/start-proxysql-isolated.bash (5)
1-22: LGTM! Standard script initialization with proper error handling (`set -e`, `set -o pipefail`), path derivation, and required variable validation.
30-42: LGTM! Coverage directory setup, network creation, and data directory preparation are correctly implemented for CI isolation.
102-120: LGTM! Docker container configuration with proper volume mounts, environment variables, and network setup is correct.
186-229: LGTM! Node configuration and scheduler installation logic is correct. The use of `CORE_NODES` ensures the scheduler is only installed on the appropriate subset of nodes, and RESTAPI ports are uniquely assigned.
231-234: LGTM! Clear completion messages for cluster initialization and startup confirmation.
Summary
Reduces Docker container count and improves CI test stability by running all ProxySQL cluster nodes as background processes inside a single container, plus several related fixes.
Single-container cluster (main change)
Instead of spawning 1+9 Docker containers per group for the ProxySQL cluster, all instances now run as background processes inside the primary container. Each node gets its own port pair (admin = 6032 + i*10, mysql = 6033 + i*10) and its own data directory.
This eliminates 9 Docker containers per group, reducing Docker networking/iptables contention that caused "no route to host" (errno 113) errors under parallel test execution.
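The per-node port scheme can be checked with a short loop (purely illustrative arithmetic, matching the admin = 6032 + i*10 / mysql = 6033 + i*10 layout from the commit message):

```shell
#!/usr/bin/env bash
# Per-node port assignment: node i gets admin=6032+i*10 and mysql=6033+i*10,
# so all nodes share one container IP without port collisions.
for i in 0 1 2; do
    ADMIN_PORT=$((6032 + i * 10))
    MYSQL_PORT=$((6033 + i * 10))
    echo "node ${i}: admin=${ADMIN_PORT} mysql=${MYSQL_PORT}"
done
# → node 0: admin=6032 mysql=6033
# → node 1: admin=6042 mysql=6043
# → node 2: admin=6052 mysql=6053
```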
Other changes
- Coverage generation output (fastcov/lcov/genhtml) now goes to `coverage-generation.log` instead of cluttering stdout (both `run-tests-isolated.bash` and `run-multi-group.bash`)
- Coverage capture uses include-based filtering (`include/`, `lib/`, `src/`, `test/`) instead of capturing system headers and vendored deps
- Clickhouse TAP tests moved to a dedicated `legacy-clickhouse-g1` group, removed `infra-clickhouse23` from legacy infras (fewer containers per legacy group)
- `AUTO_CLEANUP` defaults to 1: successful groups clean up containers automatically
- `legacy/pre-proxysql.bash` and `default/pre-proxysql.bash` are now no-ops (cluster handled by `start-proxysql-isolated.bash`)

Test plan
- Run `run-multi-group.bash` with legacy groups and verify the cluster starts inside a single container
- Run the `legacy-clickhouse-g1` group

Summary by CodeRabbit
New Features
Improvements
Tests